Likelihood Inference for Exponential-Trawl Processes


Neil Shephard and Justin J. Yang 


Abstract Integer-valued trawl processes are a class of serially correlated, stationary 
and infinitely divisible processes that Ole E. Barndorff-Nielsen has been working on 
in recent years. In this Chapter, we provide the first analysis of likelihood inference 
for trawl processes by focusing on the so-called exponential-trawl process, which 
is also a continuous time hidden Markov process with countable state space. The 
core ideas include prediction decomposition, filtering and smoothing, complete-data
analysis and the EM algorithm. These can be easily scaled up to adapt to more general
trawl processes, but at increasing computational cost.


1 Introduction 


In recent years, Ole E. Barndorff-Nielsen has been working on a class of stochastic
models called integer-valued trawl processes. References include [?], [?] and [?].
These are flexible models whose core randomness is driven by Poisson random
measures. Trawl processes are related to the up-stairs processes of [25] and the
random measure processes of [24]. Both of these processes are stationary. [?] also
brings out the relationship between their processes and M/G/∞ queues (e.g. [?],
[?] and [?, Ch. 6.31]) and mixed moving average processes (e.g. [22]). Related
discrete time count models include [?], [?], [?], [?], [?], [?], [?], [?], [?]
and [23]. Trawl processes also fall within the wide class of the so-called ambit fields
(e.g. [?] and [?]). Recently, [?] models high frequency financial data by using a
trawl process to allow for fleeting movements to prices in addition to an
integer-valued Lévy process proposed by [6].

As far as we know, there is no existing literature that directly and completely
addresses likelihood inference for these trawl processes or, equivalently, the
prediction based upon it. Even though a large number of papers focus on likelihood
inference for marked point processes (see [?] for a survey), that literature only
indirectly and partially describes trawl processes in terms of their jumps. A
thorough likelihood inference for trawl processes needs to include the information in
the initial value of the process.

In this Chapter, we provide a thorough analysis of likelihood inference for
integer-valued trawl processes and demonstrate the core ideas (prediction
decomposition, filtering, smoothing and the EM algorithm) by focusing on the so-called
exponential trawl. It is not only a simplification of the modelling framework but also
an intellectually interesting special case in its own right, as in this special case the
resulting trawl process is a continuous time hidden Markov process with countable state
space. The theoretical analysis of the filtering and smoothing problems for this type
of process has been discussed in detail by [?] and [20], using the classical theory
of Kolmogorov's forward and backward differential equations. We particularly
emphasize that the resulting EM algorithm in this special case is exact in the sense that
there are no discretization errors in its computation.

The major goal of this Chapter is to derive filtering and smoothing results in the
framework of trawl processes, so the analysis adopted here can be easily scaled up
to other general trawls or even to the inclusion of a non-stationary component
proposed in [?]. These general discussions will be dealt with elsewhere, for they
require a significantly more sophisticated particle filtering and smoothing device.
We also discuss non-negative trawl processes, which are particularly easy to work with.

The structure of this Chapter is as follows. In Section 2 we remind the reader
how to construct trawl processes using the exponential trawl. Section 3 includes
details of how to carry out filtering and smoothing for these models. In Section 4 we
show likelihood inference for exponential-trawl processes based on these filters and
smoothers. Section 5 discusses the important but analytically tractable special case
of non-negative trawl processes. We finally conclude in Section 6. The Appendix
contains the proofs and derivations of various results given in this Chapter.


2 Exponential-Trawl Processes 

In this Section, we build our notation, definitions and key structures for the exponential- 
trawl process that will be focused on throughout this Chapter. We also provide its 
log-likelihood function based on observed data. 

2.1 Definition

Our model will be based on a homogeneous Lévy basis on $[0,1] \times \mathbb{R} \to \mathbb{Z}\setminus\{0\}$,
which models the discretely scattered events of integer size (with direction)
$y \in \mathbb{Z}\setminus\{0\}$ at each point in time $s \in \mathbb{R}$ and height $x \in [0,1]$. It is defined by

\[
L(\mathrm{d}x, \mathrm{d}s) = \int_{-\infty}^{\infty} y\, N(\mathrm{d}y, \mathrm{d}x, \mathrm{d}s), \qquad (x,s) \in [0,1] \times \mathbb{R},
\]

where $N$ is a three-dimensional Poisson random measure with intensity measure

\[
\mathbb{E}\left( N(\mathrm{d}y, \mathrm{d}x, \mathrm{d}s) \right) = \nu(\mathrm{d}y)\, \mathrm{d}x\, \mathrm{d}s.
\]

Here $\mathrm{d}s$ means the arrival times are uniformly scattered (over $\mathbb{R}$), $\mathrm{d}x$ means the
random heights are also uniformly scattered (over $[0,1]$) and $\nu(\mathrm{d}y)$ is a Lévy measure
concentrated on the non-zero integers $\mathbb{Z}\setminus\{0\}$. With a slight abuse of notation, we
write $\nu(y)$ for the mass of the Lévy measure at $y$. Throughout this Chapter, we assume that

\[
\sum_{y \in \mathbb{Z}\setminus\{0\}} \nu(y) < \infty.
\]


Following [?], we think of dragging a fixed Borel measurable set
$A \subset [0,1] \times (-\infty, 0]$ through time

\[
A_t = A + (0, t), \qquad t \ge 0,
\]

so the trawl process is defined by

\[
Y_t = L(A_t) = \int_{[0,1] \times \mathbb{R}} 1_A(x, s - t)\, L(\mathrm{d}x, \mathrm{d}s).
\]

Throughout the rest of this Chapter, we will focus on the exponential trawl

\[
A = \left\{ (x,s) : s \le 0,\ 0 \le x < d(s) = \exp(s\phi) \right\}, \qquad \phi > 0,
\]

to simplify our exposition of the key ideas. We leave results on more general
trawls to another study.

Example 1. Suppose that

\[
\nu(\mathrm{d}y) = \nu^{+} \delta_{\{1\}}(\mathrm{d}y) + \nu^{-} \delta_{\{-1\}}(\mathrm{d}y), \qquad \nu^{+}, \nu^{-} > 0,
\]

where $\delta_{\{\pm 1\}}(\mathrm{d}y)$ is the Dirac point mass measure centered at $\pm 1$. The corresponding
$L(\mathrm{d}x, \mathrm{d}s)$ is called a Skellam Lévy basis, while the special case of $\nu^{-} = 0$ is
called Poisson. The upper panel of Fig. 1 shows events in $L$ using $\nu^{+} = \nu^{-} = 10$,
taking sizes $1$ and $-1$ (black and white dots respectively) with equal probability.
The lower panel of Fig. 1 then illustrates the resulting Skellam exponential-trawl
process $Y_t = L(A_t)$ using $\phi = 2$, which sums up all the effects (both positive
and negative) captured by the exponential trawl. Dynamically, $L(A_t)$ will move up
by 1 if the moving trawl $A_t$ either captures one positive event or releases a negative
one; conversely, it will move down by 1 if it captures a negative event or releases a
positive one. Notice that $Y_0 = L(A_0)$ is not necessarily zero and that the path of $Y$
at negative times is not observed.

Fig. 1 A moving trawl $A_t$ is joined by the Skellam Lévy basis $L(\mathrm{d}x, \mathrm{d}s)$, where the horizontal axis
$s$ is time and the vertical axis $x$ is height. The shaded area is an example of the exponential trawl
$A$, while we also show the outlines of $A_t$ when $t = 1/2$ and $t = 1$. Also shown below is the implied
trawl process $Y_t = L(A_t)$. Code: EPTprocess_Illurstration.R
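
The dynamics of Example 1 can be sketched in code. The snippet below is a minimal simulation sketch in Python, not the chapter's R code. It assumes the queue representation of the exponential trawl: the two hidden counts start from independent Poisson draws with means $\nu^{\pm}/\phi$, new events arrive at rates $\nu^{\pm}$, and every surviving event departs at rate $\phi$. The function names `poisson_draw` and `simulate_skellam_trawl` are hypothetical.

```python
import random

def poisson_draw(rng, mean):
    """Draw a Poisson variate by counting unit-rate exponential arrivals before `mean`."""
    count, clock = 0, rng.expovariate(1.0)
    while clock < mean:
        count += 1
        clock += rng.expovariate(1.0)
    return count

def simulate_skellam_trawl(nu_plus, nu_minus, phi, horizon, seed=0):
    """Simulate the jumps of Y_t = C_plus - C_minus on (0, horizon].

    Returns (y0, events); each event is (time, jump, kind) with jump in {+1, -1}
    and kind in {"arrival", "departure"}. The kind is hidden in real data.
    """
    rng = random.Random(seed)
    c_plus = poisson_draw(rng, nu_plus / phi)    # assumed stationary start
    c_minus = poisson_draw(rng, nu_minus / phi)
    y0 = c_plus - c_minus
    t, events = 0.0, []
    while True:
        rates = [nu_plus, nu_minus, phi * c_plus, phi * c_minus]
        total = sum(rates)
        t += rng.expovariate(total)              # waiting time to the next event
        if t > horizon:
            break
        u = rng.random() * total                 # choose which event occurred
        if u < rates[0]:
            c_plus += 1                          # positive event captured
            events.append((t, +1, "arrival"))
        elif u < rates[0] + rates[1]:
            c_minus += 1                         # negative event captured
            events.append((t, -1, "arrival"))
        elif u < rates[0] + rates[1] + rates[2]:
            c_plus -= 1                          # positive event released
            events.append((t, -1, "departure"))
        else:
            c_minus -= 1                         # negative event released
            events.append((t, +1, "departure"))
    return y0, events
```

For instance, `simulate_skellam_trawl(10.0, 10.0, 2.0, 1.0)` uses the parameter values of Example 1; the observed path is `y0` plus the running sum of the `jump` entries.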


2.2 Markovian Counting Process 

For $y \in \mathbb{Z}\setminus\{0\}$, let $C_t^{(y)} \in \{0,1,2,\ldots\}$ be the total count of surviving events of size
$y$ in the trawl at time $t$, which also includes the event that arrives exactly at time $t$, so
each $C_t^{(y)}$ must be càdlàg (right-continuous with left-limits). Then clearly the trawl
process can be represented as

\[
Y_t = \sum_{y \in \mathbb{Z}\setminus\{0\}} y\, C_t^{(y)}, \qquad t \ge 0. \tag{1}
\]

Note that each $C_t^{(y)}$ is not only a Poisson exponential-trawl process with (different)
intensity of arrivals $\nu(y)$ (and sharing the same trawl) but also an M/M/∞ queue
and hence a continuous time Markov process. Hence, for $\{\mathcal{C}_t^{(y)}\}_{t \ge 0}$ being the natural
filtration generated by the counting process $C_t^{(y)}$, i.e.,
$\mathcal{C}_t^{(y)} \triangleq \sigma(\{C_s^{(y)}\}_{0 \le s \le t})$,
it has (infinitesimal) transition probabilities (or rates or intensities)

\[
\lim_{\mathrm{d}t \to 0} \frac{\mathbb{P}\left( C_t^{(y)} - C_{t-\mathrm{d}t}^{(y)} = j \,\middle|\, \mathcal{C}_{t-\mathrm{d}t}^{(y)} \right)}{\mathrm{d}t}
=
\begin{cases}
\nu(y), & \text{if } j = 1 \\
\phi\, C_{t-}^{(y)}, & \text{if } j = -1 \\
0, & \text{if } j \in \mathbb{Z}\setminus\{-1,1\}
\end{cases}
\tag{2}
\]

The cases of $j = 1$ or $-1$, which correspond to the arrival of a new event of size $y$
and the departure of an old one, are the only two possible infinitesimal movements
of $C_t^{(y)}$ due to the point process nature of the Lévy basis. Note that the arrival rate
and departure rate are controlled by the Lévy measure $\nu$ and the trawl parameter $\phi$
respectively. A derivation of (2) can be found in many standard references on queueing
theory (e.g. [?]).


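Remark 1. Let $\Delta X_t \triangleq X_t - X_{t-}$ denote the instantaneous jump of any process $X$ at
time $t$. Then the transition probability (2) can be conveniently written in a
differential form

\[
\mathbb{P}\left( \Delta C_t^{(y)} = j \,\middle|\, \mathcal{C}_{t-}^{(y)} \right)
=
\begin{cases}
\nu(y)\, \mathrm{d}t, & \text{if } j = 1 \\
\phi\, C_{t-}^{(y)}\, \mathrm{d}t, & \text{if } j = -1 \\
0, & \text{if } j \in \mathbb{Z}\setminus\{-1,1\}
\end{cases}
\]

Throughout this Chapter, our analysis will be mainly based on this infinitesimal
point of view for ease of demonstration. All of our arguments can be rephrased
in a mathematically tighter way.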


The independence property of the Lévy basis implies independence between
the $C_t^{(y)}$ for $y \in \mathbb{Z}\setminus\{0\}$, so the joint count process

\[
C_t \triangleq \left( \ldots, C_t^{(-2)}, C_t^{(-1)}, C_t^{(1)}, C_t^{(2)}, \ldots \right)
\]

is also Markovian, which serves as the unobserved state process for the observed
hidden Markov process $Y_t$ and will be the central target of the filter and smoother
discussed below. Let $\mathcal{C}_t \triangleq \sigma(\{C_s\}_{0 \le s \le t}) = \bigvee_{y \in \mathbb{Z}\setminus\{0\}} \mathcal{C}_t^{(y)}$ be the join
filtration. Clearly, from (2), $C_t$ has (infinitesimal) transition probabilities

\[
\mathbb{P}\left( \Delta C_t = \mathbf{j} \,\middle|\, \mathcal{C}_{t-} \right)
=
\begin{cases}
\nu(y)\, \mathrm{d}t, & \text{if } \mathbf{j} = \mathbf{1}^{(y)} \text{ for some } y \\
\phi\, C_{t-}^{(y)}\, \mathrm{d}t, & \text{if } \mathbf{j} = -\mathbf{1}^{(y)} \text{ for some } y \\
0, & \text{otherwise}
\end{cases}
\tag{3}
\]

where $\mathbf{1}^{(y)} \in \mathbb{Z}^{\infty}$ is the vector that takes 1 at the $y$-th component and 0 otherwise.
The trawl process $Y_t$ can also be written as

\[
Y_t = \sum_{y=1}^{\infty} y\, Y_t^{(y)}, \qquad Y_t^{(y)} \triangleq C_t^{(y)} - C_t^{(-y)},
\]

where each $Y_t^{(y)}$ is a Skellam exponential-trawl process. Each $Y_t^{(y)}$ is observed from
the path of $Y_t$ up to its initial value $Y_0^{(y)}$, for we can exactly observe all the jumps of $Y_t$
and hence allocate them into the appropriate $Y_t^{(y)}$. In other words, we can regard the
observed trawl process as (i) a marked point process $\Delta Y_t \in \mathbb{Z}\setminus\{0\}$, which consists
of several independent (given all the $Y_0^{(y)}$) marked point processes $\Delta Y_t^{(y)} \in \{-1,1\}$,
plus (ii) the initial value $Y_0$. The missing components $Y_0^{(y)}$ will have some mild
effects on $\Delta Y_t^{(y)}$. It is this initial value challenge that differentiates the likelihood
analysis of trawl processes from that of marked point processes.

The special case where $Y_t$ is always non-negative has an even simpler structure, as
we must have $C_t^{(-y)} = 0$ for all $y = 1,2,\ldots$ and hence $C_t^{(y)} = Y_t^{(y)}$ is directly observed
up to its initial condition $C_0^{(y)}$, which can be well-approximated if the observation
period $T$ is large enough. We will go through these details in Section 5.


2.3 Conditional Intensities and Log-likelihood 

Let $\{\mathcal{F}_t\}_{t \ge 0}$ be the natural filtration generated by the observed trawl process $Y_t$,
i.e. $\mathcal{F}_t \triangleq \sigma(\{Y_s\}_{0 \le s \le t})$. Define the càdlàg conditional intensity process of the trawl
process $Y$ as

\[
\lambda_{t-}^{(y)} \triangleq \lim_{\mathrm{d}t \to 0} \frac{\mathbb{P}\left( Y_t - Y_{t-\mathrm{d}t} = y \,\middle|\, \mathcal{F}_{t-\mathrm{d}t} \right)}{\mathrm{d}t},
\qquad y \in \mathbb{Z}\setminus\{0\},\ t > 0, \tag{4}
\]

or conveniently in a differential form

\[
\lambda_{t-}^{(y)}\, \mathrm{d}t = \mathbb{P}\left( \Delta Y_t = y \,\middle|\, \mathcal{F}_{t-} \right). \tag{5}
\]

It means the (time-varying) predictive intensity of a size $y$ move at time $t$ of the
trawl process, conditional on information instantaneously before time $t$.

Remark 2. To emphasize the $\mathcal{F}_t$-predictability of $\lambda^{(y)}$, i.e., being adapted to the left
natural filtration $\mathcal{F}_{t-}$, we will keep the subscript $t-$ throughout this Chapter. This is
particularly informative in the implementation of likelihood calculations, reminding
us to take the left-limit of the intensity process whenever there is a jump.

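For any two $\sigma$-fields $\mathcal{F}$ and $\mathcal{G}$, write $\left( \mathrm{d}\mathbb{P} / \mathrm{d}\mathbb{Q} \right)_{\mathcal{F}|\mathcal{G}}$ for the Radon-Nikodym
derivative over $\mathcal{F}|\mathcal{G}$ between two probability measures $\mathbb{P}$ and $\mathbb{Q}$.
In particular, when $\mathcal{G} = \sigma(X)$ for any random variable $X$, we will simply write
the subscript as $\mathcal{F}|X$. The following classical result serves as the foundation for all
likelihood inference for jump processes.

Theorem 1. Let $X_t$ be any integer-valued stochastic process and $\{\mathcal{F}_t^X\}_{t \ge 0}$ be its
associated natural filtration. Assume that, under both $\mathbb{P}$ and $\mathbb{Q}$, (i) it has finite
expected number of jumps during $(0,T]$, and (ii) the conditional intensities $\lambda_{t-}^{(y),\mathbb{P}}$ and
$\lambda_{t-}^{(y),\mathbb{Q}}$ are well-defined using (4) and (5). Then $\mathbb{P} \ll \mathbb{Q}$ over $\mathcal{F}_T^X | X_0$ if and only if
$\lambda_{t-}^{(y),\mathbb{Q}}$ is strictly positive. In this case, the logarithmic Radon-Nikodym derivative
over $\mathcal{F}_T^X | X_0$ is

\[
\log \left( \frac{\mathrm{d}\mathbb{P}}{\mathrm{d}\mathbb{Q}} \right)_{\mathcal{F}_T^X | X_0}
= \sum_{0 < t \le T} \sum_{y \in \mathbb{Z}\setminus\{0\}} \log \left( \frac{\lambda_{t-}^{(y),\mathbb{P}}}{\lambda_{t-}^{(y),\mathbb{Q}}} \right) 1_{\{\Delta X_t = y\}}
- \int_{t \in (0,T]} \sum_{y \in \mathbb{Z}\setminus\{0\}} \left( \lambda_{t-}^{(y),\mathbb{P}} - \lambda_{t-}^{(y),\mathbb{Q}} \right) \mathrm{d}t.
\]

Proposition 14.4.1 of [?] provides a complete and mathematically rigorous treatment
of this Theorem. For completeness, we also provide an intuitive and heuristic
derivation in the Appendix. A direct application of Theorem 1 gives the following
Corollary.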


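Corollary 1. The log-likelihood function of the (general) trawl process is (ignoring
the constant)

\[
l_{\mathcal{F}_T}(\theta) = \sum_{0 < t \le T} \sum_{y \in \mathbb{Z}\setminus\{0\}} \log \lambda_{t-}^{(y)}\, 1_{\{\Delta Y_t = y\}}
- \int_{t \in (0,T]} \sum_{y \in \mathbb{Z}\setminus\{0\}} \lambda_{t-}^{(y)}\, \mathrm{d}t + l_{Y_0}(\theta), \tag{6}
\]

where the parameters of interest $\theta$ include the Lévy measure $\nu(\mathrm{d}y)$ (i.e. the $\nu(y)$'s) and
the trawl parameter $\phi$.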


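The study of likelihood inference for trawl processes then reduces to the
calculation of the conditional intensities $\lambda_{t-}^{(y)}$ for $y \in \mathbb{Z}\setminus\{0\}$. Now, by the law of iterated
expectations and the fact that $\mathcal{C}_t \supseteq \mathcal{F}_t$ for all $t$ (because of (1)), we have

\[
\begin{aligned}
\lambda_{t-}^{(y)}\, \mathrm{d}t
&= \mathbb{E}\left( \mathbb{P}\left( \Delta Y_t = y \mid \mathcal{C}_{t-} \right) \mid \mathcal{F}_{t-} \right) \\
&= \mathbb{E}\left( \mathbb{P}\left( \Delta C_t = \mathbf{1}^{(y)} \mid \mathcal{C}_{t-} \right) \mid \mathcal{F}_{t-} \right)
 + \mathbb{E}\left( \mathbb{P}\left( \Delta C_t = -\mathbf{1}^{(-y)} \mid \mathcal{C}_{t-} \right) \mid \mathcal{F}_{t-} \right) \\
&= \nu(y)\, \mathrm{d}t + \phi\, \mathbb{E}\left( C_{t-}^{(-y)} \mid \mathcal{F}_{t-} \right) \mathrm{d}t,
\end{aligned}
\]

where the second line follows because the event $\Delta Y_t = y$ must come from either an
arrival of a new size $y$ event or a departure of an old size $-y$ event; the third line
follows from (3). Thus,

\[
\lambda_{t-}^{(y)} = \nu(y) + \phi\, \mathbb{E}\left( C_{t-}^{(-y)} \mid \mathcal{F}_{t-} \right), \qquad y \in \mathbb{Z}\setminus\{0\}. \tag{7}
\]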


In the next Section, we will study an exact filtering scheme to numerically calculate
$\mathbb{E}\left( C_{t-}^{(-y)} \mid \mathcal{F}_{t-} \right)$.

The non-negative exponential-trawl process, where we always have positive
events, admits a further simplification

\[
\lambda_{t-}^{(y)} = \nu(y), \qquad \lambda_{t-}^{(-y)} = \phi\, \mathbb{E}\left( C_{t-}^{(y)} \mid \mathcal{F}_{t-} \right), \qquad y = 1,2,\ldots, \tag{8}
\]

so likelihood inference for such a case is easier. In the Poisson case, all the impacts
are of size one, so in particular $C_0^{(1)} = Y_0$ is also observed (as $C_0^{(y)} = 0$ for all $y \neq 1$),
which allows us to bypass the conditional expectation in (8) for $y = 1$.
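
As a worked instance of (7), consider the Skellam basis of Example 1, where only $y = \pm 1$ carry mass. The shorthand $\lambda^{(\pm)}$ and $C^{(\pm)}$ for $y = \pm 1$ is assumed here for illustration; the display only specializes (7) and uses $Y_{t-} = C_{t-}^{(+)} - C_{t-}^{(-)}$.

```latex
\begin{align*}
\lambda_{t-}^{(+)} &= \nu^{+} + \phi\, \mathbb{E}\left( C_{t-}^{(-)} \mid \mathcal{F}_{t-} \right), \\
\lambda_{t-}^{(-)} &= \nu^{-} + \phi\, \mathbb{E}\left( C_{t-}^{(+)} \mid \mathcal{F}_{t-} \right)
  = \nu^{-} + \phi \left( Y_{t-} + \mathbb{E}\left( C_{t-}^{(-)} \mid \mathcal{F}_{t-} \right) \right).
\end{align*}
```

Since $Y_{t-}$ is observed, a single filtered expectation determines both intensities in the Skellam case.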


3 Exact Filter and Smoother for Exponential-Trawl Processes 
3.1 Filtering 

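In general we need to solve the filtering problem for $C_t$ to implement (6) and (7).
Denote the filtering probability mass function as

\[
p_{t,s}(\mathbf{j}) \triangleq \mathbb{P}\left( C_t = \mathbf{j} \mid \mathcal{F}_s \right), \qquad
\mathbf{j} = (\ldots, j_{-2}, j_{-1}, j_1, j_2, \ldots), \quad j_y = 0,1,2,\ldots, \quad t, s \ge 0.
\]

Also, let $\|\mathbf{j}\|_1 \triangleq \sum_{y \in \mathbb{Z}\setminus\{0\}} j_y$ and $D_t \triangleq \|C_t\|_1 = \sum_{y \in \mathbb{Z}\setminus\{0\}} C_t^{(y)}$.

Our goal here is to sequentially update $p_{t-,t-}(\mathbf{j})$, where the initial distribution is
derived from

\[
C_0^{(y)} \overset{\text{indep.}}{\sim} \text{Poisson}\left( \frac{\nu(y)}{\phi} \right) \quad \text{subject to} \quad \sum_{y \in \mathbb{Z}\setminus\{0\}} y\, C_0^{(y)} = Y_0,
\]

so, by letting $\text{Poisson}(x | \lambda) \triangleq \lambda^x e^{-\lambda} / x!$, we have

\[
p_{0,0}(\mathbf{j}) = \frac{\prod_{y \in \mathbb{Z}\setminus\{0\}} \text{Poisson}\left( j_y \mid \nu(y)/\phi \right)}{\mathbb{P}\left( \sum_{y \in \mathbb{Z}\setminus\{0\}} y\, C_0^{(y)} = Y_0 \right)}
\]

for every $\mathbf{j}$ that satisfies the constraint $\sum_{y} y\, j_y = Y_0$, where the denominator can be
numerically calculated using the inverse fast Fourier transform [?].

Notice that the filtering distribution updates not only at the times when the
process jumps but also during the inactivity periods. We discuss these two cases
separately.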


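Theorem 2 (Forward Filtering).

1. [Update by inactivity] Assume that the last jump time is $\tau$ (or $\tau = 0$) and the
current time is $t-$, where $\Delta Y_s = 0$ for $\tau < s < t$ (and $\Delta Y_\tau \neq 0$ if $\tau > 0$). Then

\[
p_{t-,t-}(\mathbf{j}) = \frac{e^{-\phi \|\mathbf{j}\|_1 (t - \tau)}\, p_{\tau,\tau}(\mathbf{j})}{\sum_{\mathbf{k}} e^{-\phi \|\mathbf{k}\|_1 (t - \tau)}\, p_{\tau,\tau}(\mathbf{k})}, \tag{9}
\]

where $p_{\tau,\tau}$ is the filtering distribution we have already known at time $\tau$.

2. [Update by jump] Assume that the current time is $\tau-$ and $\Delta Y_\tau = y$ for some
$y \in \mathbb{Z}\setminus\{0\}$. Then

\[
p_{\tau,\tau}(\mathbf{j}) = \frac{1}{\lambda_{\tau-}^{(y)}} \left( \nu(y)\, p_{\tau-,\tau-}\left( \mathbf{j} - \mathbf{1}^{(y)} \right) + \phi\, (j_{-y} + 1)\, p_{\tau-,\tau-}\left( \mathbf{j} + \mathbf{1}^{(-y)} \right) \right), \tag{10}
\]

where $p_{\tau-,\tau-}$ is the filtering distribution we have already known at time $\tau-$.

Overall, the filtering procedures (9) and (10) imply that $p_{t,t}(\mathbf{j})$ can be updated
in continuous time without discretization errors at any finite set of discrete time
points, so we call it an exact filter.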


Example 2. For the Skellam exponential-trawl process with Lévy intensities $\nu^{+}$ and $\nu^{-}$,
we always have

\[
Y_{t-} = C_{t-}^{(+)} - C_{t-}^{(-)}, \qquad t > 0,
\]

so knowing $p_{t-,t-}(j) \triangleq \mathbb{P}\left( C_{t-}^{(-)} = j \mid \mathcal{F}_{t-} \right)$ immediately gives us $p_{t-,t-}(j,k)$.
Hence, the filtering updating scheme reduces to the following: starting from $\tau = 0$,

\[
\begin{aligned}
p_{t-,t-}(j) &\propto e^{-\phi (2j + Y_\tau)(t - \tau)}\, p_{\tau,\tau}(j) && \text{if } \Delta Y_s = 0 \text{ for } \tau < s < t, \\
p_{\tau,\tau}(j) &\propto \nu^{+} p_{\tau-,\tau-}(j) + \phi\, (j+1)\, p_{\tau-,\tau-}(j+1) && \text{if } \Delta Y_\tau = 1, \\
p_{\tau,\tau}(j) &\propto \nu^{-} p_{\tau-,\tau-}(j-1) + \phi\, (j + Y_{\tau-})\, p_{\tau-,\tau-}(j) && \text{if } \Delta Y_\tau = -1.
\end{aligned}
\]

We then renormalize $p_{t-,t-}(j)$ such that $\sum_{j=0}^{\infty} p_{t-,t-}(j) = 1$ in each step of the
updates. Knowing the filtering distributions $p_{t-,t-}(j)$ allows us to calculate

\[
\mathbb{E}\left( C_{t-}^{(+)} \mid \mathcal{F}_{t-} \right) = \sum_{j=0}^{\infty} j\, p_{t-,t-}(j) + Y_{t-}.
\]

Using the following settings, with the time unit being seconds,

\[
\nu^{+} = 0.013, \quad \nu^{-} = 0.011, \quad \phi = 0.034, \quad T = 21 \times 60^2 = 75{,}600 \text{ (sec.)}, \tag{11}
\]

Fig. 2 shows a simulated path of the trawl process $Y_t$ together with the filtering
expectations of $C_t^{(+)}$, $C_t^{(-)}$ and $D_t = C_t^{(+)} + C_t^{(-)}$, the total number of surviving
(both positive and negative) events in the trawl at time $t$.

Fig. 2 Top left: A simulated path for the Skellam exponential-trawl process $Y_t$. Top right,
Bottom left, Bottom right: Paths of the true hidden counting processes $C_t^{(+)}$, $C_t^{(-)}$ and
$D_t = C_t^{(+)} + C_t^{(-)}$ of surviving events in the trawl along with their filtering estimations.
Code: EPTprocess_FilteringSmoothing_Illustration.R
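
The Skellam updating scheme of Example 2 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the chapter's R implementation: the state $j$ (the hidden count of negative events) is truncated at a hypothetical upper bound `j_max`, and all function names are invented for this sketch.

```python
import math

def poisson_pmf(k, mean):
    """Poisson(k | mean), computed on the log scale."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def initial_filter(y0, nu_plus, nu_minus, phi, j_max):
    """p_{0,0}(j): independent Poisson counts conditioned on C+ - C- = y0."""
    weights = []
    for j in range(j_max + 1):
        if j + y0 < 0:                  # C+ = j + y0 must be non-negative
            weights.append(0.0)
        else:
            weights.append(poisson_pmf(j + y0, nu_plus / phi)
                           * poisson_pmf(j, nu_minus / phi))
    return normalise(weights)

def update_by_inactivity(p, y, phi, elapsed):
    """No jump for `elapsed` time units while Y = y: weight by exp(-phi * D * elapsed)."""
    support = [j for j, pj in enumerate(p) if pj > 0.0]
    base = min(2 * j + y for j in support)   # factor out the slowest decay
    return normalise([
        math.exp(-phi * (2 * j + y - base) * elapsed) * pj if pj > 0.0 else 0.0
        for j, pj in enumerate(p)
    ])

def update_by_jump(p, y_before, jump, nu_plus, nu_minus, phi):
    """Filter update at a jump of size +1 or -1, with Y = y_before just before it."""
    n = len(p)
    weights = []
    for j in range(n):
        if jump == 1:
            # positive arrival (j unchanged) or negative departure (j + 1 -> j)
            nxt = p[j + 1] if j + 1 < n else 0.0
            weights.append(nu_plus * p[j] + phi * (j + 1) * nxt)
        else:
            # negative arrival (j - 1 -> j) or positive departure (j unchanged)
            prev = p[j - 1] if j >= 1 else 0.0
            weights.append(nu_minus * prev + phi * (j + y_before) * p[j])
    return normalise(weights)

def filtered_mean_minus(p):
    """E(C^- | F): adding the observed Y gives the filtered mean of C^+."""
    return sum(j * pj for j, pj in enumerate(p))
```

A filtering pass alternates `update_by_inactivity` over each waiting time with `update_by_jump` at each observed jump, starting from `initial_filter`. The truncation bound should be chosen large enough that the discarded tail mass is negligible for the parameters at hand.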


3.2 Smoothing 

We now consider the smoothing procedure for the exponential-trawl process $Y_t$,
which is necessary for the likelihood inference based on the EM algorithm
introduced below.

Having run the filtering procedure up to time $T$, we then start from $p_{T,T}$ to conduct
the smoothing procedure.

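Theorem 3 (Backward Smoothing).

1. [Update by inactivity] Assume that the (backward) last jump time is $\tau$ (or $\tau = T$)
and the current time is $t$, where $\Delta Y_s = 0$ for $t \le s < \tau$ (and $\Delta Y_\tau \neq 0$ if $\tau < T$).
Then

\[
p_{t,T}(\mathbf{j}) = p_{\tau-,T}(\mathbf{j}),
\]

where $p_{\tau-,T}$ is the smoothing distribution we have already known at time $\tau-$.

2. [Update by jump] Assume that the current time is $\tau$ and $\Delta Y_\tau = y$ for some
$y \in \mathbb{Z}\setminus\{0\}$. Then

\[
p_{\tau-,T}(\mathbf{j}) = \frac{p_{\tau-,\tau-}(\mathbf{j})}{\lambda_{\tau-}^{(y)}}
\left( \nu(y)\, \frac{p_{\tau,T}\left( \mathbf{j} + \mathbf{1}^{(y)} \right)}{p_{\tau,\tau}\left( \mathbf{j} + \mathbf{1}^{(y)} \right)}
+ \phi\, j_{-y}\, \frac{p_{\tau,T}\left( \mathbf{j} - \mathbf{1}^{(-y)} \right)}{p_{\tau,\tau}\left( \mathbf{j} - \mathbf{1}^{(-y)} \right)} \right), \tag{12}
\]

where $p_{\tau-,\tau-}$ and $p_{\tau,\tau}$ are from the forward filtering procedure and $p_{\tau,T}$ is the
smoothing distribution we have already known at time $\tau$.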


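The two terms in (12) are

\[
\mathbb{P}\left( C_{\tau-} = \mathbf{j},\ C_\tau = \mathbf{j} + \mathbf{1}^{(y)} \mid \mathcal{F}_T \right)
\quad \text{and} \quad
\mathbb{P}\left( C_{\tau-} = \mathbf{j},\ C_\tau = \mathbf{j} - \mathbf{1}^{(-y)} \mid \mathcal{F}_T \right)
\]

respectively, so, in particular,

\[
\mathbb{P}\left( \Delta C_\tau^{(y)} = 1 \mid \mathcal{F}_T \right)
= \sum_{\mathbf{j}} \frac{p_{\tau-,\tau-}(\mathbf{j})}{\lambda_{\tau-}^{(y)}}\, \nu(y)\, \frac{p_{\tau,T}\left( \mathbf{j} + \mathbf{1}^{(y)} \right)}{p_{\tau,\tau}\left( \mathbf{j} + \mathbf{1}^{(y)} \right)}, \tag{13}
\]

\[
\mathbb{P}\left( \Delta C_\tau^{(-y)} = -1 \mid \mathcal{F}_T \right)
= \sum_{\mathbf{j}} \frac{p_{\tau-,\tau-}(\mathbf{j})}{\lambda_{\tau-}^{(y)}}\, \phi\, j_{-y}\, \frac{p_{\tau,T}\left( \mathbf{j} - \mathbf{1}^{(-y)} \right)}{p_{\tau,\tau}\left( \mathbf{j} - \mathbf{1}^{(-y)} \right)}. \tag{14}
\]

These (total) weights in (12) will be recorded for every jump time $\tau$ as by-products
of the smoothing procedure, for later they will play important roles in the EM
algorithm introduced in Subsection 4.3.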

Example 3 (Continued from Example 2).

For the Skellam exponential-trawl process, the smoothing updating scheme reduces
to the following: starting from $\tau = T$,

\[
\begin{aligned}
p_{t,T}(j) &= p_{\tau-,T}(j) && \text{if } \Delta Y_s = 0 \text{ for } t \le s < \tau, \\
p_{\tau-,T}(j) &\propto p_{\tau-,\tau-}(j) \left( \nu^{+}\, \frac{p_{\tau,T}(j)}{p_{\tau,\tau}(j)} + \phi\, j\, \frac{p_{\tau,T}(j-1)}{p_{\tau,\tau}(j-1)} \right) && \text{if } \Delta Y_\tau = 1, \\
p_{\tau-,T}(j) &\propto p_{\tau-,\tau-}(j) \left( \nu^{-}\, \frac{p_{\tau,T}(j+1)}{p_{\tau,\tau}(j+1)} + \phi\, (Y_{\tau-} + j)\, \frac{p_{\tau,T}(j)}{p_{\tau,\tau}(j)} \right) && \text{if } \Delta Y_\tau = -1.
\end{aligned}
\]

We also renormalize $p_{t,T}(j)$ in each step of the updates.

Using the same simulated path and the same setting (11) as in Example 2, we
show the smoothing expectations of $C_t^{(+)}$, $C_t^{(-)}$ and $D_t$ in Fig. 3. For most of the
time, the smoothing expectations match the truth quite well and remove
those peaks of the filtering expectations that result from departures (such as the one
close to $t = 400$ in the plot for $C_t^{(-)}$).
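
One backward step of the Skellam smoother at a jump time can be sketched as below. This is an illustrative Python sketch, not the chapter's R code: it assumes the filtering distributions just before and just after the jump were stored during the forward pass as lists indexed by $j$ (the hidden count of negative events), and the function name `smooth_at_jump` is hypothetical.

```python
def smooth_at_jump(p_before, p_after, s_after, y_before, jump,
                   nu_plus, nu_minus, phi):
    """Return p_{tau-,T} from p_{tau-,tau-}, p_{tau,tau} and p_{tau,T}.

    p_before : filtering distribution just before the jump
    p_after  : filtering distribution just after the jump
    s_after  : smoothing distribution just after the jump
    y_before : observed value of Y just before the jump
    jump     : +1 or -1
    """
    n = len(p_before)

    def ratio(k):
        # smoothing-to-filtering ratio, set to zero outside the support
        if 0 <= k < n and p_after[k] > 0.0:
            return s_after[k] / p_after[k]
        return 0.0

    weights = []
    for j in range(n):
        if jump == 1:
            # positive arrival keeps j; negative departure moves j to j - 1
            term = nu_plus * ratio(j) + phi * j * ratio(j - 1)
        else:
            # negative arrival moves j to j + 1; positive departure keeps j
            term = nu_minus * ratio(j + 1) + phi * (y_before + j) * ratio(j)
        weights.append(p_before[j] * term)
    total = sum(weights)
    return [w / total for w in weights]
```

Between jumps the smoothing distribution is simply carried backward unchanged, so a full backward pass only needs to call this function once per observed jump, starting from the final filtering distribution.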

Fig. 3 Top left: A simulated path for the Skellam exponential-trawl process $Y_t$. Top right,
Bottom left, Bottom right: Paths of the true hidden counting processes $C_t^{(+)}$, $C_t^{(-)}$ and
$D_t = C_t^{(+)} + C_t^{(-)}$ of surviving events in the trawl along with their smoothing estimations.
The horizontal axis is time in seconds. Code: EPTprocess_FilteringSmoothing_Illustration.R


Now we are capable of conducting likelihood inference for exponential-trawl 
processes as one of the most important applications of the filtering and smoothing 
procedures we have already built here. 


4 Likelihood Inference for General Exponential-Trawl Processes 

It has been reported by [?] and [?] that moment-based inference for the family
of trawl processes can be easily performed, but such inference depends arbitrarily
on the design of the procedure. In this Section, we focus on the calculation of the
maximum likelihood estimate (MLE) for exponential-trawl processes with a general
Lévy basis and demonstrate its correctness using several examples.

4.1 MLE Calculation based on Filtering


Recall that the evaluation of the log-likelihood (6) requires the calculation of the
conditional intensities and their integrals

\[
\int_{t \in (0,T]} \lambda_{t-}^{(y)}\, \mathrm{d}t = \nu(y)\, T + \phi \sum_{\mathbf{j}} j_{-y} \int_{t \in (0,T]} p_{t-,t-}(\mathbf{j})\, \mathrm{d}t, \tag{15}
\]

which follows from (7) and $\mathbb{E}\left( C_{t-}^{(-y)} \mid \mathcal{F}_{t-} \right) = \sum_{\mathbf{j}} j_{-y}\, p_{t-,t-}(\mathbf{j})$.

However, we do not know the integral $\int_{t \in (0,T]} p_{t-,t-}(\mathbf{j})\, \mathrm{d}t$ analytically, as the
denominator in (9) also depends on $t$. Hence, we have to calculate (9) on a dense grid of
time points, separated by a time gap $\delta_{\text{inactivity}}$ during the inactivity periods, and
approximate (15) by linear interpolation. Clearly, the smaller the time gap $\delta_{\text{inactivity}}$,
the smaller the numerical error in (15) but the larger the computational burden.

Example 4. Using the true parameters in (11) and simulating 10-day-long data
with $T = 756{,}000$ (sec.), Fig. 4 shows how an inappropriate choice of $\delta_{\text{inactivity}}$
produces a wrong log-likelihood surface no matter how much correctly simulated
data we supply, where the comparison is made with respect to the first-day portion
(75,600 (sec.)) of the 10-day-long simulated data. Using the same one-day-long
data, Fig. 5 also shows the corresponding log-likelihood function over $\nu^{+}$ or $\nu^{-}$
with the other parameters fixed at the truth. Including the bottom left panel of Fig. 4, all
of the MLE's (solid lines) are reasonably close to the true values (dashed lines) and
the likelihood ratio tests suggest that the p-values are all greater than 20%.
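
The grid approximation of (15) over a single inactivity period can be sketched as follows for the Skellam case. This is a Python sketch under stated assumptions: `p` is the filtering distribution of the hidden count of negative events at the last jump, `y` is the observed value of the process during the period, linear interpolation on the grid is implemented as the trapezoid rule, and the function names are hypothetical.

```python
import math

def filtered_mean(p, y, phi, elapsed):
    """E(C^-_{t-} | F_{t-}) after `elapsed` time without jumps, using update (9)."""
    weights = [math.exp(-phi * (2 * j + y) * elapsed) * pj for j, pj in enumerate(p)]
    return sum(j * w for j, w in enumerate(weights)) / sum(weights)

def integrate_mean(p, y, phi, length, delta):
    """Trapezoid rule for the integral of the filtered mean over an inactivity
    period of the given length, on a grid with spacing at most `delta`."""
    steps = max(1, math.ceil(length / delta))
    h = length / steps
    values = [filtered_mean(p, y, phi, i * h) for i in range(steps + 1)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))
```

For `p = [0.5, 0.5]`, `y = 0` and `phi = 1`, the filtered mean is $e^{-2s}/(1 + e^{-2s})$, whose integral over $(0, 1]$ is $\tfrac{1}{2}\left( \log 2 - \log(1 + e^{-2}) \right)$; comparing `integrate_mean` with this closed form for several values of `delta` illustrates the trade-off between grid size and numerical error described above.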


4.2 Complete-Data Likelihood Inference 

Even though in general it is computationally expensive to calculate the MLE
by direct filtering, the maximum complete-data likelihood estimate (MCLE) is much
simpler. A comprehensive analysis of complete-data likelihood inference follows.

Let $N_T^{(y),A}$ and $N_T^{(y),D}$ be the counting processes of the arrivals of temporary size $y$
events and the departures of old size $y$ events during the period $(0,T]$. Also let

\[
N_T^{\text{type}} \triangleq \sum_{y \in \mathbb{Z}\setminus\{0\}} N_T^{(y),\text{type}}, \qquad \text{type} = A, D.
\]

Theorem 4. The complete-data log-likelihood function of the exponential-trawl
process is (ignoring the constant)

\[
l_{\mathcal{C}_T}(\theta) = \sum_{y \in \mathbb{Z}\setminus\{0\}} \left( \log(\nu(y)) \left( N_T^{(y),A} + C_0^{(y)} \right) - \nu(y) \left( T + \phi^{-1} \right) \right)
+ \log(\phi) \left( N_T^{D} - D_0 \right) - \phi \int_{t \in (0,T]} D_{t-}\, \mathrm{d}t, \tag{16}
\]

so the corresponding MCLE's for the Lévy measure and the trawl parameter are

\[
\hat{\nu}_{\text{MCLE}}(y) = \frac{N_T^{(y),A} + C_0^{(y)}}{T + \hat{\phi}_{\text{MCLE}}^{-1}}, \quad y \in \mathbb{Z}\setminus\{0\}, \qquad
\hat{\phi}_{\text{MCLE}} = \frac{\left( N_T^{D} - D_0 - \bar{D}_T \right) + \sqrt{\left( N_T^{D} - D_0 - \bar{D}_T \right)^2 + 4 \bar{D}_T \left( N_T^{A} + N_T^{D} \right)}}{2 \int_{t \in (0,T]} D_{t-}\, \mathrm{d}t}, \tag{17}
\]

where

\[
\bar{D}_T \triangleq \frac{1}{T} \int_{t \in (0,T]} D_{t-}\, \mathrm{d}t.
\]

Furthermore, the MCLE's above are strongly consistent: with probability 1, as $T \to \infty$,

\[
\hat{\phi}_{\text{MCLE}} \to \phi \quad \text{and} \quad \hat{\nu}_{\text{MCLE}}(y) \to \nu(y), \quad y \in \mathbb{Z}\setminus\{0\}.
\]

We note that $\hat{\phi}_{\text{MCLE}}$ depends on $\int_{t \in (0,T]} D_{t-}\, \mathrm{d}t$, the total number of possible
departures, weighted by time, at risk during the period $(0,T]$.

Fig. 4 Log-likelihood plots over $\phi$ (with $\nu^{+}$ and $\nu^{-}$ fixed at the truth) using different $\delta_{\text{inactivity}}$
and a simulated 10-day-long ($T = 756{,}000$ (sec.)) Skellam exponential-trawl process. The
one-day-long data is the first tenth of the simulated data. The dashed lines indicate the true value of $\phi$,
while the solid lines indicate the optimal value of $\phi$ in each plot. The p-values using the likelihood
ratio test are 0.104% (Top left), 21.0% (Bottom left), $8.82 \times 10^{-13}$ (Top right) and 46.1% (Bottom
right). Code: EPTprocess_MLE_Inference_Simulation_Small_vs_Large.R

Fig. 5 Log-likelihood plots over either $\nu^{+}$ or $\nu^{-}$ for one simulated Skellam exponential-trawl
process. The dashed lines indicate the true value, while the solid lines indicate the optimal value of
$\nu^{+}$ or $\nu^{-}$ in the individual plot. The p-values using the likelihood ratio test are 40.5% (Left) and
33.4% (Right). Code: EPTprocess_MLE_Inference_Simulation_Small_vs_Large.R
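
The complete-data estimates can be computed from summary statistics of the hidden path. The Python sketch below is an illustration under stated assumptions rather than the chapter's code: it takes the complete-data log-likelihood to have the form $\sum_y \left[ \log \nu(y) (N_T^{(y),A} + C_0^{(y)}) - \nu(y)(T + 1/\phi) \right] + \log(\phi)(N_T^{D} - D_0) - \phi \int D_{t-}\, \mathrm{d}t$, and it obtains $\hat{\phi}$ as the positive root of the quadratic that results from setting both score equations to zero. The function and argument names are hypothetical.

```python
import math

def mcle(arrivals, initial_counts, departures, integral_d, horizon):
    """Complete-data estimates (nu_hat, phi_hat).

    arrivals       : dict mapping event size y to its number of arrivals in (0, T]
    initial_counts : dict mapping event size y to C_0^{(y)}
    departures     : total number of departures in (0, T]
    integral_d     : integral of D_{t-} over (0, T]
    horizon        : T
    """
    d0 = sum(initial_counts.values())
    n_arrivals = sum(arrivals.values())
    d_bar = integral_d / horizon
    b = departures - d0 - d_bar
    # positive root of: integral_d * x^2 - b * x - (n_arrivals + departures) / T = 0
    phi_hat = (b + math.sqrt(b * b + 4.0 * d_bar * (n_arrivals + departures))) \
        / (2.0 * integral_d)
    sizes = set(arrivals) | set(initial_counts)
    nu_hat = {
        y: (arrivals.get(y, 0) + initial_counts.get(y, 0)) / (horizon + 1.0 / phi_hat)
        for y in sizes
    }
    return nu_hat, phi_hat
```

As a check on the algebra, plugging the returned values back into the score with respect to $\phi$, namely $\sum_y \hat{\nu}(y)/\hat{\phi}^2 + (N_T^{D} - D_0)/\hat{\phi} - \int D_{t-}\, \mathrm{d}t$, should give zero up to rounding.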


4.3 MLE Calculation based on EM Algorithm 

In this Subsection, we introduce an EM algorithm that is particularly suitable for
exponential-trawl processes, as it involves no discretization errors. The EM algorithm
is also computationally efficient. Compared with generic optimization methods like
limited-memory BFGS (L-BFGS), the updating scheme suggested by EM can
converge to the MLE in fewer steps and without discretization error. Clearly, EM needs
some extra computation in each step for backward smoothing, but in aggregate
EM performs much faster than L-BFGS, as EM skips the intermediate filtering
calculations during the inactivity periods.

E-Step The linear form of the complete-data log-likelihood (16) allows us to
easily take its expectation with respect to $\mathbb{P}(\cdot \mid \mathcal{F}_T)$ (under a set of old estimated
parameters $\theta_{\text{old}}$), which then requires the calculation of the following quantities
using the smoothing distribution $p_{t,T}$:

\[
\begin{aligned}
\mathbb{E}\left( N_T^{(y),A} \mid \mathcal{F}_T \right) &= \sum_{0 < t \le T} \mathbb{P}\left( \Delta C_t^{(y)} = 1 \mid \mathcal{F}_T \right), \\
\mathbb{E}\left( N_T^{(y),D} \mid \mathcal{F}_T \right) &= \sum_{0 < t \le T} \mathbb{P}\left( \Delta C_t^{(y)} = -1 \mid \mathcal{F}_T \right), \\
\mathbb{E}\left( C_0^{(y)} \mid \mathcal{F}_T \right) &= \sum_{\mathbf{j}} j_y\, p_{0,T}(\mathbf{j}), \qquad
\mathbb{E}\left( D_0 \mid \mathcal{F}_T \right) = \sum_{\mathbf{j}} \|\mathbf{j}\|_1\, p_{0,T}(\mathbf{j}), \\
\mathbb{E}\left( D_{t-} \mid \mathcal{F}_T \right) &= \sum_{\mathbf{j}} \|\mathbf{j}\|_1\, p_{t-,T}(\mathbf{j}),
\end{aligned}
\tag{18}
\]

where (13) and (14) will be extensively used. Note that $\mathbb{E}\left( D_{t-} \mid \mathcal{F}_T \right)$ will be a
step function of $t$, so the calculation of $\int_{t \in (0,T]} \mathbb{E}\left( D_{t-} \mid \mathcal{F}_T \right) \mathrm{d}t$ is trivially exact.

M-Step Since the E-Step generates a $Q$ function that takes the same functional
form in $\theta$ as (16), the solution to the M-Step takes the same form as the MCLE in
(17), where we just replace each of the hidden data related terms by their smoothing
expectations in (18). This can also be viewed as an instance of the plug-in
principle for (17), i.e., replacing those unknown quantities (e.g. $1_{\{\Delta C_t^{(y)} = 1\}}$) by
the known ones (e.g. $\mathbb{P}\left( \Delta C_t^{(y)} = 1 \mid \mathcal{F}_T \right)$). We further use the solution of this
M-Step for the next iteration.

Table 1 The MLE calculations on one simulated Skellam exponential-trawl process using
the L-BFGS-B procedure in R (with default settings) and the EM algorithm (with uniform
tolerance $10^{-6}$ on the parameter space). The R elapsed time is 137.4 (sec.)
for L-BFGS-B and 3.3 (sec.) for EM, which is about a 40-fold speed-up. Code:
EPTprocess_MLE_Inference_Simulation_LBFGS_vs_EM.R

Estimation   Parameter                          Log-likelihood
             $\nu^{+}$    $\nu^{-}$    $\phi$       $\delta_{\text{inactivity}} = 0.5$   $\delta_{\text{inactivity}} = 0.01$
Truth        0.01260    0.01111    0.03402    -15,974.98             -15,974.9543
L-BFGS-B     0.01201    0.01128    0.03362    -15,973.92             -15,973.8915
EM           0.01199    0.01126    0.03354    -15,973.91             -15,973.8881


Example 5 (Continued from Example 4).

Using the same simulated Skellam exponential-trawl process path, Table 1
compares the MLE derived from (i) the L-BFGS-B procedure in the optim function
of the R language (using the default tolerance settings) with that from (ii) the EM
algorithm (using the same initial parameter value), which stops once each parameter
changes by less than a uniform tolerance of $10^{-6}$.

As expected, the EM algorithm gives estimates that are very close
to those from direct optimization of the log-likelihood function (using $\delta_{\text{inactivity}} = 0.5$). An
interesting feature here is that the MLE found by the EM algorithm has a slightly
larger log-likelihood value (even for $\delta_{\text{inactivity}} = 0.01$) than that found by L-BFGS-B,
which might be attributable to the default optimization tolerance of R being too loose.

The L-BFGS-B procedure uses 27 evaluations of the filtering procedure (9 of
them for objective function evaluations and 18 of them for numerical gradients);
by comparison, the EM algorithm takes 12 evaluations of the filtering procedure
plus 12 more of the smoothing procedure. In aggregate, the EM algorithm is over
40 times faster than L-BFGS-B in terms of computation time.

In stark contrast to Example 4, the EM algorithm does not require the fine
evaluation of the integrals of $\lambda_{t-}^{(y)}$, so not only is the filtering procedure in each
iteration of EM faster (as it skips the grid calculations during the
inactivity periods) but the converged result of EM also maximizes the numerically
errorless log-likelihood (as $\delta_{\text{inactivity}}$ plays no role in EM).
In conclusion, using the EM algorithm to search for the MLE of exponential-trawl
processes dominates direct optimization of the log-likelihood in both numerical
quality and computation speed.


4.4 Likelihood Inference without the Initial Information 

If we consider the complete-data log-likelihood given the information $C_0$, i.e.
$l_{\mathcal{C}_T | C_0}(\theta)$, then the MCLE's are even simpler:

\[
\hat{\nu}_{\text{MCLE}}(y) = \frac{N_T^{(y),A}}{T}, \qquad
\hat{\phi}_{\text{MCLE}} = \frac{N_T^{D}}{\int_{t \in (0,T]} D_{t-}\, \mathrm{d}t}.
\]

Note that these estimates are the most natural frequency estimates provided that we
know the hidden state process $C_t$: $\nu(y)$ is estimated by the sample intensity of all
the arrivals of size $y$ events, while $\phi^{-1}$ is estimated by the average lifetime among
all the departures of the temporary events, for the lifetime of any temporary event is
exponentially distributed with mean $1/\phi$.
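
These frequency estimates are simple to compute once the hidden arrivals and departures are known. The Python sketch below assumes a hypothetical event format, a time-ordered list of `(time, size, kind)` tuples with `kind` equal to `"A"` for an arrival and `"D"` for a departure, together with the initial total count `d0`; none of these names come from the chapter's code.

```python
def conditional_mcle(d0, events, horizon):
    """Frequency estimates given C_0: arrivals per unit time, and departures
    per unit of time-at-risk (the integral of D_{t-} over (0, T])."""
    arrivals = {}
    departures = 0
    integral_d = 0.0
    d, last = d0, 0.0
    for time, size, kind in events:
        integral_d += d * (time - last)   # D is constant between events
        last = time
        if kind == "A":
            arrivals[size] = arrivals.get(size, 0) + 1
            d += 1
        else:
            departures += 1
            d -= 1
    integral_d += d * (horizon - last)    # final stretch up to T
    nu_hat = {y: n / horizon for y, n in arrivals.items()}
    phi_hat = departures / integral_d
    return nu_hat, phi_hat
```

With `d0 = 1`, events `[(1.0, 1, "A"), (2.0, 1, "D"), (3.0, -1, "A")]` and `horizon = 4.0`, the time-at-risk integral is $1 + 2 + 1 + 2 = 6$, so the sketch gives $\hat{\phi} = 1/6$ and $\hat{\nu}(1) = \hat{\nu}(-1) = 1/4$.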

However, there is a subtle statistical inconsistency if one wants to build an EM
algorithm based on $l_{\mathcal{C}_T | C_0}(\theta)$. In practice, all the initial values $C_0^{(y)}$ are unknown,
so the only way we can work with $l_{\mathcal{C}_T | C_0}(\theta)$ is to treat them as nuisance parameters.
Thus, the EM $Q$ function is defined by

\[
Q(\theta', C_0' \mid \theta, C_0) \triangleq \mathbb{E}_{\theta}\left( l_{\mathcal{C}_T | C_0'}(\theta') \mid C_0, \mathcal{F}_T \right),
\]

which not only requires the smoothing scheme based on $\mathbb{P}_{\theta}(\cdot \mid C_0, \mathcal{F}_T)$, not $\mathbb{P}_{\theta}(\cdot \mid \mathcal{F}_T)$,
but also finally gives us the MLE of the joint log-likelihood function $l_{\mathcal{F}_T}(\theta, C_0)$,
not the MLE of $l_{\mathcal{F}_T}(\theta)$ nor of $l_{\mathcal{F}_T | Y_0}(\theta)$. On the other hand, one might also define
the EM $Q$ function as

\[
Q(\theta' \mid \theta) \triangleq \mathbb{E}_{\theta}\left( l_{\mathcal{C}_T | C_0}(\theta') \mid \mathcal{F}_T \right),
\]

but in this case

\[
Q(\theta \mid \theta) = l_{\mathcal{F}_T | Y_0}(\theta) - \mathbb{E}_{\theta}\left( l_{C_0 | Y_0}(\theta) \mid \mathcal{F}_T \right) \neq l_{\mathcal{F}_T | Y_0}(\theta),
\]

which then breaks the fundamental monotonicity that guarantees the validity of
EM:

\[
l_{\mathcal{F}_T | Y_0}(\theta^{*}) \ge Q(\theta^{*} \mid \theta) = \max_{\theta'} Q(\theta' \mid \theta) \ge Q(\theta \mid \theta) = l_{\mathcal{F}_T | Y_0}(\theta).
\]

Therefore, even though direct filtering allows the calculation of the MLE
whether we include the initial information $Y_0$ or not (i.e. to maximize $l_{\mathcal{F}_T}(\theta)$ or
$l_{\mathcal{F}_T | Y_0}(\theta)$), a correct EM-based inference will automatically enforce the
consideration of $Y_0$ (i.e. to maximize $l_{\mathcal{F}_T}(\theta)$ using EM). This is a bit different from likelihood
inference for marked point processes, which usually ignores the effect of the initial
value $Y_0$. This mild difference will clearly disappear asymptotically as $T \to \infty$, but
here we still prefer to present a complete likelihood analysis for trawl processes
instead of treating them the same as marked point processes.


5 Likelihood Inference for Non-negative Exponential-Trawl 
Processes 

In this Section, we focus on exponential-trawl processes that are always non-negative.
Then all the negative movements of this type of process must be attributed
to the departures of the positive events in the trawl, so it is natural to split up $Y$ into
the counting processes of size $y$ jumps

$$N_t^{(y)} \triangleq \sum_{0 < s \leq t} 1_{\{\Delta Y_s = y\}}, \quad y \in \mathbb{Z} \setminus \{0\}. \qquad (19)$$


Then, as mentioned at the end of Subsection 2.2,

$$Y_t = \sum_{y=1}^{\infty} y C_t^{(y)} = \sum_{y=1}^{\infty} y C_0^{(y)} + \sum_{y=1}^{\infty} y \left( N_t^{(y)} - N_t^{(-y)} \right).$$


Clearly, the path of $Y_t$ reveals the path of each of the individual $N_t^{(y)}$ for $y \in \mathbb{Z} \setminus \{0\}$,
so $N_t^{(y)} \in \mathcal{F}_t$. Thus, the only unknown objects here are the $C_0^{(y)}$’s, for we just see
$Y_0 = \sum_{y=1}^{\infty} y C_0^{(y)}$ and all the departures resulting from the $C_0^{(y)}$’s. If we knew $C_0^{(y)}$, then we
would see the complete path of $C_t^{(y)}$ and hence likelihood inference would be particularly
tractable.


5.1 Partial Likelihood Inference


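We can specialize Corollary 1 using (8) and write down the log-likelihood for the
non-negative case (ignoring the constant):

$$\begin{aligned}
l_{\mathcal{F}_T}(\theta) &= \sum_{y=1}^{\infty} \left( \log(\nu(y)) N_T^{(y)} - \nu(y) T \right) \\
&\quad + \log(\phi) N_T^{(-)} - \phi \int_{t \in (0,T]} \sum_{y=1}^{\infty} E_\theta\left( C_{t-}^{(y)} \mid \mathcal{F}_{t-} \right) dt \\
&\quad + \sum_{0 < t \leq T} \sum_{y=1}^{\infty} \log E_\theta\left( C_{t-}^{(y)} \mid \mathcal{F}_{t-} \right) 1_{\{\Delta Y_t = -y\}} + l_{Y_0}(\theta),
\end{aligned}$$

where $N_T^{(-)} \triangleq \sum_{y=1}^{\infty} N_T^{(-y)}$ and $l_{Y_0}(\theta)$ is the log-likelihood of the initial value $Y_0$.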

Like the general case we studied in Section 4, there are no analytic expressions
available for the filtering expectations $E_\theta( C_{t-}^{(y)} \mid \mathcal{F}_{t-} )$ and the initial likelihood
$l_{Y_0}(\theta)$, so finding $\hat{\theta}_{\mathrm{MLE}}$ also requires the EM techniques we introduced before.
However, the first part of $l_{\mathcal{F}_T}(\theta)$ that involves the $\nu(y)$’s is particularly analytically
tractable, so this leads us to consider the following maximum partial likelihood estimate
(MPLE) for the Lévy measure:

$$\hat{\nu}_{\mathrm{MPLE}}(y) = \frac{N_T^{(y)}}{T}, \quad y = 1, 2, 3, \ldots,$$

which is a non-parametric moment estimate that is apparent from the non-negative
setting.
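As a minimal sketch of this estimate (an illustration only, not the authors' implementation), the MPLE needs nothing more than the observed jump sizes of $Y$ on $(0, T]$. The function name `mple_levy_measure` and the toy jump list below are hypothetical.

```python
from collections import Counter


def mple_levy_measure(jump_sizes, T):
    """Return {y: N_T^(y) / T} for the positive jump sizes y observed on (0, T]."""
    if T <= 0:
        raise ValueError("T must be positive")
    # N_T^(y): number of jumps of size exactly y, for y = 1, 2, ...
    counts = Counter(dy for dy in jump_sizes if dy > 0)
    return {y: n / T for y, n in sorted(counts.items())}


# Hypothetical observed jumps of Y on (0, 10]; negative entries are departures
# and do not enter the estimate.
jumps = [1, 2, -1, 1, 1, -2, 3, -1, 1]
nu_hat = mple_levy_measure(jumps, T=10.0)
print(nu_hat)
```

Sizes that never appear as a positive jump are simply absent from the result, which corresponds to an estimate of zero.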

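Even though $\hat{\nu}_{\mathrm{MPLE}}$ is not $\hat{\nu}_{\mathrm{MLE}}$, it has several advantages. First, it has strong
consistency, i.e., with probability 1, $\hat{\nu}_{\mathrm{MPLE}}(y) \to \nu(y)$ as $T \to \infty$. Second, it is
asymptotically equivalent to the MCLE, because

$$\hat{\nu}_{\mathrm{MCLE}}(y) = \frac{N_T^{(y)} + C_0^{(y)}}{T + \hat{\phi}_{\mathrm{MCLE}}^{-1}} = \frac{N_T^{(y)}}{T} \cdot \frac{1 + C_0^{(y)} / N_T^{(y)}}{1 + \hat{\phi}_{\mathrm{MCLE}}^{-1} / T} \approx \hat{\nu}_{\mathrm{MPLE}}(y),$$

where the MCLE of $\phi$ is simply given from (17), with the quantities there replaced by
their counterparts in the non-negative setting. Third, it allows us to estimate each
component of the Lévy measure separately from the others and from $\phi$, because, given a
long enough path of $Y$, including the initial value $C_0^{(y)}$ and $\hat{\phi}_{\mathrm{MCLE}}$ brings no strong
improvement to the estimation quality of $\hat{\nu}_{\mathrm{MPLE}}$.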

Alternatively, a parameterized common intensity function $\nu(y \mid \eta)$ can be used,
where $\eta$ is some finite dimensional parameter. Then the MPLE is found by solving










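$$\hat{\eta}_{\mathrm{MPLE}} = \arg\max_{\eta} \sum_{y=1}^{\infty} \left( \log(\nu(y \mid \eta)) N_T^{(y)} - \nu(y \mid \eta) T \right)$$

and letting $\hat{\nu}_{\mathrm{MPLE}}(y) = \nu(y \mid \hat{\eta}_{\mathrm{MPLE}})$.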
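The following sketch illustrates the parametric MPLE under one assumed family, the geometric intensity $\nu(y \mid \eta) = \|\nu\| \eta (1-\eta)^{y-1}$ used later in Example 6, with both $\|\nu\|$ and $\eta$ treated as unknown. The golden-section search, the function names and the toy counts are choices made here for illustration. For this family the partial log-likelihood separates, so the numerical answer can be checked against the closed forms $\|\nu\| = N/T$ and $\eta = N/S$, where $N = \sum_y N_T^{(y)}$ and $S = \sum_y y N_T^{(y)}$; these closed forms are derived for this sketch and are not quoted from the source.

```python
import math
from collections import Counter


def partial_loglik(total_mass, eta, counts, T):
    """sum_y [ log(nu(y|eta)) * N_T^(y) - nu(y|eta) * T ] for the geometric family."""
    ll = -total_mass * T  # the geometric weights sum to one, so sum_y nu(y|eta) = total_mass
    for y, n in counts.items():
        ll += n * (math.log(total_mass) + math.log(eta) + (y - 1) * math.log(1.0 - eta))
    return ll


def golden_section_max(f, lo, hi, tol=1e-10):
    """Maximise a unimodal function f on [lo, hi] by golden-section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)


# Hypothetical jump counts N_T^(y) observed over T = 100 time units.
counts = Counter({1: 160, 2: 70, 3: 40, 4: 30})
T = 100.0
n_total = sum(counts.values())
mass_hat = n_total / T  # maximiser in the total mass for any fixed eta
eta_hat = golden_section_max(
    lambda e: partial_loglik(mass_hat, e, counts, T), 1e-6, 1.0 - 1e-6
)
eta_closed = n_total / sum(y * n for y, n in counts.items())
print(mass_hat, eta_hat, eta_closed)
```

For families without a closed form, only `partial_loglik` would need to change; the one-dimensional search can be replaced by any optimizer suited to the dimension of $\eta$.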

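To infer on the trawl parameter $\phi$, we can simply plug in $\hat{\nu}_{\mathrm{MPLE}}$ (either
parametric or non-parametric) and then run the filtering procedure to calculate

$$E_{\theta = (\hat{\nu}_{\mathrm{MPLE}}, \phi)}\left( C_{t-}^{(y)} \mid \mathcal{F}_{t-} \right)$$

for $y = 1, 2, \ldots$ and $t \in (0, T]$. Combining this with a (one-dimensional) optimization
procedure we can find

$$\hat{\phi}_{\mathrm{MPLE}} = \arg\max_{\phi} l_{\mathcal{F}_T}(\hat{\nu}_{\mathrm{MPLE}}, \phi).$$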


5.2 Estimating the Missing Initial Values

Except in the Poisson case ($Y_0 = C_0^{(1)}$, $C_0^{(y)} = 0$ for all $y > 1$ and hence in particular
$\hat{\theta}_{\mathrm{MCLE}} = \hat{\theta}_{\mathrm{MLE}}$), all the $C_0^{(y)}$’s are missing, so in principle we need to estimate these
initial values in order to get (an approximation of) $\hat{\theta}_{\mathrm{MCLE}}$. Indeed, the EM algorithm
also does so through the smoothing expectations

$$E\left( C_t^{(y)} \mid \mathcal{F}_T \right) = E\left( C_0^{(y)} \mid \mathcal{F}_T \right) + N_t^{(y)} - N_t^{(-y)},$$

but it just iterates (17) until convergence. Nevertheless, there is another simpler
estimate of $C_0^{(y)}$ thanks to the special non-negative feature.

The following Proposition only relies on the fact that $Y_t$ is non-negative and in
fact does not depend on the choice of the trawl.


Proposition 1. Assume that the (general) trawl process $Y_t$ is non-negative. If

$$C_{0,T}^{(y),L} \triangleq \sup_{t \in [0,T]} \left( N_t^{(-y)} - N_t^{(y)} \right), \qquad C_{0,T}^{(y),U} \triangleq \left\lfloor \frac{Y_0 - \sum_{y' \neq y} y' C_{0,T}^{(y'),L}}{y} \right\rfloor,$$

where $N_0^{(y)} = 0$ conventionally and $\lfloor x \rfloor$ means the integer part of $x$, then

$$C_{0,T}^{(y),L} \leq C_0^{(y)} \leq C_{0,T}^{(y),U}.$$

Furthermore,

$$\lim_{T \to \infty} C_{0,T}^{(y),U} = \lim_{T \to \infty} C_{0,T}^{(y),L} = C_0^{(y)}.$$

Thus, a straightforward and sharp estimate of $C_0^{(y)}$ can be given by, e.g.,

$$\hat{C}_0^{(y)} \triangleq \frac{C_{0,T}^{(y),U} + C_{0,T}^{(y),L}}{2},$$

so using this estimate in (17) will give an estimate of $\theta$ that is almost as good as
$\hat{\theta}_{\mathrm{MCLE}}$.
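The bounds of Proposition 1 can be computed in a single pass over the observed jumps. The sketch below is illustrative only: the function name `initial_value_bounds` and the toy path are hypothetical, and the midpoint is the estimate suggested above as one possible choice.

```python
def initial_value_bounds(y0, jump_sizes):
    """Bounds (lower, upper) on C_0^(y) from Y_0 and the time-ordered jumps of Y on (0, T].

    Only sizes that appear among the jumps are reported; any other size has
    lower bound 0 and therefore contributes nothing to the weighted sum.
    """
    sizes = sorted({abs(dy) for dy in jump_sizes})
    net = {y: 0 for y in sizes}    # running value of N_t^(-y) - N_t^(y)
    lower = {y: 0 for y in sizes}  # running supremum; starts at 0 because N_0 = 0
    for dy in jump_sizes:
        y = abs(dy)
        net[y] += 1 if dy < 0 else -1
        lower[y] = max(lower[y], net[y])
    weighted = sum(y * lower[y] for y in sizes)
    # Integer part of (Y_0 - sum over y' != y of y' * lower bound of y') / y.
    upper = {y: (y0 - (weighted - y * lower[y])) // y for y in sizes}
    return lower, upper


# Toy path: Y_0 = 5 built from one size-1 event and two size-2 events.
y0 = 5
jumps = [1, -1, -2, -1, 2, -2, -2]
lower, upper = initial_value_bounds(y0, jumps)
estimate = {y: (lower[y] + upper[y]) / 2 for y in lower}
print(lower, upper, estimate)
```

In this toy path every initial event has departed by the end of the sample, so the lower and upper bounds coincide with the true initial counts.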


Example 6. Figure 6 illustrates Proposition 1 with a non-negative geometric Lévy
basis, where

$$\nu(y \mid \eta) = \|\nu\| \eta (1 - \eta)^{y-1}, \quad y = 1, 2, \ldots,$$

$$\|\nu\| = 3, \quad \eta = 0.5, \quad \phi = 0.5, \quad T = 100.$$

The paths of the upper bound $C_{0,t}^{(y),U}$ and the lower bound $C_{0,t}^{(y),L}$ are shown as step
functions of time $t$ in Fig. 6. We can observe a strong convergent pattern, as all the
bounds for different $y$ coincide for $t > 15$, giving perfect estimates of the initial
values $C_0^{(y)}$’s. Furthermore, as $Y_0 = C_0^{(1)} + 2 C_0^{(2)} + 4 C_0^{(4)}$ in this case, all the other
$C_0^{(y)}$’s for $y \neq 1, 2, 4$ must be zero. We then have discovered all the initial values and
can use them to conduct MCLE by (17).
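A path like the one in this example can be simulated directly from the event description of the process: initial events are Poisson with mean $\nu(y)/\phi$, new events of size $y$ arrive at rate $\nu(y)$, and every event lives for an exponential time with rate $\phi$. The sketch below is not the authors' R code; the function name `simulate_trawl`, the truncation of the support at `y_max`, and the Poisson sampler built from exponential gaps are all assumptions made for illustration.

```python
import random


def simulate_trawl(mass, eta, phi, T, y_max=30, seed=0):
    """Simulate (Y_0, C_0, jumps) for a non-negative exponential-trawl process.

    The geometric Levy measure nu(y) = mass * eta * (1 - eta)**(y - 1) is
    truncated at y_max, which is an approximation made in this sketch.
    """
    rng = random.Random(seed)
    c0, events = {}, []
    for y in range(1, y_max + 1):
        nu_y = mass * eta * (1.0 - eta) ** (y - 1)
        # C_0^(y) ~ Poisson(nu_y / phi): count unit-rate arrivals before nu_y / phi.
        n0, s = 0, rng.expovariate(1.0)
        while s < nu_y / phi:
            n0 += 1
            s += rng.expovariate(1.0)
        c0[y] = n0
        # Initial events depart after an Exp(phi) time (memoryless lifetimes).
        for _ in range(n0):
            d = rng.expovariate(phi)
            if d <= T:
                events.append((d, -y))
        # New size-y events arrive at rate nu_y and later depart.
        t = rng.expovariate(nu_y)
        while t <= T:
            events.append((t, y))
            d = t + rng.expovariate(phi)
            if d <= T:
                events.append((d, -y))
            t += rng.expovariate(nu_y)
    events.sort()
    y0 = sum(y * n for y, n in c0.items())
    return y0, c0, events


# Parameters of Example 6: ||nu|| = 3, eta = 0.5, phi = 0.5, T = 100.
y0, c0, events = simulate_trawl(mass=3.0, eta=0.5, phi=0.5, T=100.0, seed=1)
print(y0, len(events))
```

The returned list holds `(time, jump size)` pairs in time order, so the jump sizes alone can be used to compute the bounds of Proposition 1.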


Fig. 6 Top left: A simulated path for the exponential-trawl process $Y_t$ using a non-negative geometric
Lévy basis. Top right, bottom left, bottom right: Paths of $C_{0,t}^{(y),U}$ and $C_{0,t}^{(y),L}$ along with the true $C_0^{(y)}$
for $y = 1, 2, 4$ (legend: upper bound, lower bound, truth; horizontal axis: time). Code: EPTprocess_NonNegativeInitialEstimate.R


6 Conclusion

In this Chapter, we studied likelihood-based inference of the trawl processes by 
explicitly working on the filtering and smoothing procedures inherited from this 
model. It is plausible and practically implementable under the exponential trawl. 
We used some simulation examples to justify the correctness of our procedures. 

The major contribution of this Chapter is to provide the easiest first step
toward likelihood inference for all of the other more general trawl processes, which
might even allow the inclusion of a non-stationary Levy process component. m 
calls it a fleeting price process and extensively uses it for the study of high frequency 
financial econometrics. 

The filters for the fleeting price process they proposed will allow an econometrically
interesting decomposition of observed prices into equilibrium prices and market
microstructure noise. More empirical analysis of these will be addressed in
future work.


Appendix: Proofs and Derivations 
6.1 Heuristic Proof of Theorem^ 

Our heuristic derivation starts from the following prediction decomposition of the 
Radon-Nikodym derivative: 



( 20 ) 


where the integral over $t \in (0, T]$ means a continuous sum of the integrand random
variables. Thus,




where the first equality follows because $X_{t-}$ is known in $\mathcal{F}_{t-}$ and the third equality
follows from (5). Therefore, (20) can be rewritten as

$$\int_{t \in (0,T]} \cdots \left( \frac{dP}{dQ} \right)_{X_t \mid \mathcal{F}_{t-}} \cdots$$



where the second equality follows from $\log(1 - x) \approx -x$ for small $x$ and $\{t \in (0, T] : \Delta X_t \neq 0\}$
has Lebesgue measure 0.

6.2 Heuristic Proof of Theorem^ 

6.2.1 Update by inactivity 

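We want to update $p_{\tau,\tau}(j)$ by incorporating the information $\mathcal{F}_{(\tau,t)} \triangleq \sigma(\{\Delta Y_s = 0, \tau < s < t\})$
using Bayes’ Theorem:

$$\begin{aligned}
P(C_{t-} = j \mid \mathcal{F}_{t-}) &= P(C_\tau = j \mid \mathcal{F}_{t-}) = P(C_\tau = j \mid \mathcal{F}_\tau, \mathcal{F}_{(\tau,t)}) \\
&\propto P(\mathcal{F}_{(\tau,t)} \mid \mathcal{F}_\tau, C_\tau = j) P(C_\tau = j \mid \mathcal{F}_\tau),
\end{aligned}$$

where the first equality holds because there is no activity of $Y_s$ for $s \in (\tau, t)$ and
hence the hidden state $C$ must stay the same.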

Using the prediction decomposition, we have

$$\log P(\mathcal{F}_{(\tau,t)} \mid \mathcal{F}_\tau, C_\tau = j) = \cdots = -\sum_{y \in \mathbb{Z} \setminus \{0\}} \nu(y) (t - \tau) - \phi \|j\|_1 (t - \tau),$$

where the second equality intuitively holds because we know the instantaneous departure
probability of a size $y$ event at time $s$ is $\phi C_{s-}^{(y)} ds$ but $C_{s-}^{(y)} = C_\tau^{(y)} = j_y$ under
$\mathcal{F}_{(\tau,t)}$; the third equality follows from $\log(1 - x) \approx -x$ for small $x$. Therefore,

$$P(C_{t-} = j \mid \mathcal{F}_{t-}) \propto e^{-\phi \|j\|_1 (t - \tau)} P(C_\tau = j \mid \mathcal{F}_\tau),$$

where we throw out the term $\exp( -\sum_{y \in \mathbb{Z} \setminus \{0\}} \nu(y) (t - \tau) )$ because it doesn’t depend
on $j$. Normalizing the equation above leads to the desired result.
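In code, the update by inactivity is a reweighting followed by a normalization. The sketch below assumes, purely for illustration, that the filtering distribution is stored as a dictionary from configurations $j$ (tuples of non-negative counts) to probabilities; the function name `update_by_inactivity` is hypothetical.

```python
import math


def update_by_inactivity(p, phi, dt):
    """Reweight P(C_tau = j | F_tau) by exp(-phi * ||j||_1 * dt) and renormalise.

    p maps each configuration j (a tuple of counts) to its probability;
    dt is the length t - tau of the period without jumps.
    """
    weights = {j: prob * math.exp(-phi * sum(j) * dt) for j, prob in p.items()}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


# Toy filtering distribution over the number of size-1 events in the trawl.
p_tau = {(0,): 0.5, (1,): 0.3, (2,): 0.2}
p_t_minus = update_by_inactivity(p_tau, phi=0.5, dt=2.0)
print(p_t_minus)
```

The longer the process stays inactive, the more probability mass moves towards configurations with few events, since crowded configurations would have produced a departure sooner.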


6.2.2 Update by jump 

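We want to update $p_{\tau-,\tau-}(j)$ by incorporating the piece of information $\Delta Y_\tau = y$.
First note that

$$\begin{aligned}
P(C_\tau = j \mid \mathcal{F}_\tau) &= P(C_\tau = j \mid \mathcal{F}_{\tau-}, \Delta Y_\tau = y) \\
&= P(C_\tau = j, C_{\tau-} = j - 1^{(y)} \mid \mathcal{F}_{\tau-}, \Delta Y_\tau = y) \\
&\quad + P(C_\tau = j, C_{\tau-} = j + 1^{(-y)} \mid \mathcal{F}_{\tau-}, \Delta Y_\tau = y),
\end{aligned}$$

which corresponds to the arrival of a new size $y$ event and the departure of an old
size $-y$ event, respectively.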

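For the first term,

$$\begin{aligned}
P(C_\tau = j, C_{\tau-} = j - 1^{(y)} \mid \mathcal{F}_{\tau-}, \Delta Y_\tau = y)
&= \frac{P(C_\tau = j, C_{\tau-} = j - 1^{(y)}, \Delta Y_\tau = y \mid \mathcal{F}_{\tau-})}{P(\Delta Y_\tau = y \mid \mathcal{F}_{\tau-})} \\
&= \frac{P(C_\tau = j, \Delta Y_\tau = y \mid C_{\tau-} = j - 1^{(y)}, \mathcal{F}_{\tau-}) P(C_{\tau-} = j - 1^{(y)} \mid \mathcal{F}_{\tau-})}{P(\Delta Y_\tau = y \mid \mathcal{F}_{\tau-})} \\
&= \frac{P(\Delta C_\tau = 1^{(y)} \mid C_{\tau-} = j - 1^{(y)}, \mathcal{F}_{\tau-}) P(C_{\tau-} = j - 1^{(y)} \mid \mathcal{F}_{\tau-})}{P(\Delta Y_\tau = y \mid \mathcal{F}_{\tau-})} \\
&= \frac{\nu(y)\, dt}{P(\Delta Y_\tau = y \mid \mathcal{F}_{\tau-})} P(C_{\tau-} = j - 1^{(y)} \mid \mathcal{F}_{\tau-}),
\end{aligned}$$

where the fourth equality follows from (3) (using $\mathcal{C}_{\tau-} \supset \mathcal{F}_{\tau-}$) and (5).
Using similar arguments, the second term is

$$P(C_\tau = j, C_{\tau-} = j + 1^{(-y)} \mid \mathcal{F}_{\tau-}, \Delta Y_\tau = y) = \frac{\phi (j_{-y} + 1)\, dt}{P(\Delta Y_\tau = y \mid \mathcal{F}_{\tau-})} P(C_{\tau-} = j + 1^{(-y)} \mid \mathcal{F}_{\tau-}).$$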



Combining all of these gives us the required result. 
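A sketch of the resulting update by jump, under the same illustrative conventions as before (a distribution stored as a dictionary over tuples of counts): each prior configuration is moved to the configurations it can reach by a jump of size $y$, weighted by the arrival intensity $\nu(y)$ or by the departure intensity $\phi$ times the number of size $-y$ events present, and the weights are then normalized. The function name `update_by_jump`, the `sizes` tuple and the toy numbers are hypothetical.

```python
def update_by_jump(p_prev, y, nu, phi, sizes):
    """Map P(C_{tau-} = . | F_{tau-}) to P(C_tau = . | F_tau) after observing a jump y.

    sizes lists the event sizes tracked by each configuration tuple, and
    nu maps an event size to its Levy intensity.
    """
    idx = {s: i for i, s in enumerate(sizes)}
    weights = {}
    for j_prev, prob in p_prev.items():
        if y in idx:
            # Arrival of a new size-y event, with intensity nu(y).
            j = list(j_prev)
            j[idx[y]] += 1
            j = tuple(j)
            weights[j] = weights.get(j, 0.0) + nu.get(y, 0.0) * prob
        if -y in idx and j_prev[idx[-y]] > 0:
            # Departure of an old size-(-y) event, with intensity phi * count.
            j = list(j_prev)
            j[idx[-y]] -= 1
            j = tuple(j)
            weights[j] = weights.get(j, 0.0) + phi * j_prev[idx[-y]] * prob
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("the observed jump has zero probability under p_prev")
    return {j: w / total for j, w in weights.items() if w > 0.0}


# Toy case with event sizes -1 and +1: Y_{tau-} = 0 is compatible with an
# empty trawl (0, 0) or with one event of each size (1, 1).
sizes = (-1, 1)
nu = {-1: 1.0, 1: 1.0}
p_prev = {(0, 0): 0.6, (1, 1): 0.4}
p_new = update_by_jump(p_prev, y=1, nu=nu, phi=0.5, sizes=sizes)
print(p_new)
```

In the toy case the upward jump is explained either by a new size $+1$ event or by the departure of the size $-1$ event, and both routes into the configuration `(0, 1)` add up before normalization.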


6.3 Heuristic Proof of Theorem^

The case of updating the smoothing distribution $p_{\tau,T}(j)$ due to inactivity is trivial
because the hidden configuration $C$ must stay unchanged during the inactive
time period $[t, \tau)$.


6.3.1 Update by jump 

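We now consider the case of (backward) updating the smoothing distribution $p_{\tau,T}(j)$
due to the jump $\Delta Y_\tau = y$. Then

$$\begin{aligned}
P(C_{\tau-} = j \mid \mathcal{F}_T) &= P(C_{\tau-} = j, C_\tau = j + 1^{(y)} \mid \mathcal{F}_T) + P(C_{\tau-} = j, C_\tau = j - 1^{(-y)} \mid \mathcal{F}_T) \\
&= P(C_{\tau-} = j \mid \mathcal{F}_T, C_\tau = j + 1^{(y)}) P(C_\tau = j + 1^{(y)} \mid \mathcal{F}_T) \\
&\quad + P(C_{\tau-} = j \mid \mathcal{F}_T, C_\tau = j - 1^{(-y)}) P(C_\tau = j - 1^{(-y)} \mid \mathcal{F}_T).
\end{aligned}$$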


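Note that

$$\begin{aligned}
P(C_{\tau-} = j \mid \mathcal{F}_T, C_\tau = k) &= P(C_{\tau-} = j \mid \mathcal{F}_\tau, C_\tau = k) \qquad (21) \\
&= \frac{P(C_\tau = k \mid C_{\tau-} = j, \mathcal{F}_\tau) P(C_{\tau-} = j \mid \mathcal{F}_\tau)}{P(C_\tau = k \mid \mathcal{F}_\tau)} \\
&= \frac{P(C_\tau = k \mid C_{\tau-} = j, \mathcal{F}_\tau) P(\Delta Y_\tau = y \mid C_{\tau-} = j, \mathcal{F}_{\tau-}) P(C_{\tau-} = j \mid \mathcal{F}_{\tau-})}{P(C_\tau = k \mid \mathcal{F}_\tau) P(\Delta Y_\tau = y \mid \mathcal{F}_{\tau-})} \\
&= \frac{P(C_\tau = k, \Delta Y_\tau = y \mid C_{\tau-} = j, \mathcal{F}_{\tau-}) P(C_{\tau-} = j \mid \mathcal{F}_{\tau-})}{P(C_\tau = k \mid \mathcal{F}_\tau) P(\Delta Y_\tau = y \mid \mathcal{F}_{\tau-})},
\end{aligned}$$

where the first equality holds due to the Markov property of $C_t$ (a heuristic derivation
is given later) and the second and third equalities follow from Bayes’ Theorem. Since

$$\begin{aligned}
P(C_\tau = j + 1^{(y)}, \Delta Y_\tau = y \mid C_{\tau-} = j, \mathcal{F}_{\tau-}) &= P(\Delta C_\tau = 1^{(y)} \mid C_{\tau-} = j, \mathcal{F}_{\tau-}) = \nu(y)\, dt, \\
P(C_\tau = j - 1^{(-y)}, \Delta Y_\tau = y \mid C_{\tau-} = j, \mathcal{F}_{\tau-}) &= P(\Delta C_\tau = -1^{(-y)} \mid C_{\tau-} = j, \mathcal{F}_{\tau-}) = \phi j_{-y}\, dt,
\end{aligned}$$

combining all of these gives us the required result.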
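The backward step can be sketched in the same illustrative dictionary representation. For each pair of configurations $(j, k)$ linked by the observed jump, the weight is the transition intensity times the filtering probability of $j$ just before the jump; normalizing over $j$ for fixed $k$ gives the conditional probability in (21), which is then averaged against the smoothing distribution after the jump. The names `transitions` and `smooth_back_through_jump` and the toy inputs are hypothetical.

```python
def transitions(j_prev, y, nu, phi, sizes):
    """Yield (k, rate) for each configuration k reachable from j_prev by a jump y."""
    idx = {s: i for i, s in enumerate(sizes)}
    if y in idx:
        # Arrival of a size-y event, with intensity nu(y).
        k = list(j_prev)
        k[idx[y]] += 1
        yield tuple(k), nu.get(y, 0.0)
    if -y in idx and j_prev[idx[-y]] > 0:
        # Departure of a size-(-y) event, with intensity phi * count.
        k = list(j_prev)
        k[idx[-y]] -= 1
        yield tuple(k), phi * j_prev[idx[-y]]


def smooth_back_through_jump(p_filt_prev, p_smooth_next, y, nu, phi, sizes):
    """Return P(C_{tau-} = . | F_T) from the filter before the jump and the smoother after it."""
    joint = {}
    for j, prob in p_filt_prev.items():
        for k, rate in transitions(j, y, nu, phi, sizes):
            joint[(j, k)] = rate * prob
    # Normalising constant for each k: sum over all j that can reach k.
    norm = {}
    for (j, k), w in joint.items():
        norm[k] = norm.get(k, 0.0) + w
    smoothed = {}
    for (j, k), w in joint.items():
        if norm[k] > 0.0:
            smoothed[j] = smoothed.get(j, 0.0) + w / norm[k] * p_smooth_next.get(k, 0.0)
    return smoothed


# Toy inputs with event sizes -1 and +1 and an observed jump y = +1.
sizes = (-1, 1)
nu = {-1: 1.0, 1: 1.0}
p_filt_prev = {(0, 0): 0.6, (1, 1): 0.4}
p_smooth_next = {(0, 1): 0.5, (1, 2): 0.5}
p_smooth_prev = smooth_back_through_jump(p_filt_prev, p_smooth_next, 1, nu, 0.5, sizes)
print(p_smooth_prev)
```

The infinitesimal factor $dt$ and the term $P(\Delta Y_\tau = y \mid \mathcal{F}_{\tau-})$ cancel in the normalization, which is why they do not appear in the code.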


6.3.2 Derivation of (21)


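Let $\mathcal{F}_{(\tau,T]} \triangleq \sigma(\{Y_t\}_{\tau < t \leq T})$ and $\mathcal{C}_{(\tau,T]} \triangleq \sigma(\{C_t\}_{\tau < t \leq T})$. Note that heuristically
Bayes’ Theorem implies

$$\begin{aligned}
P(C_{\tau-} = j \mid \mathcal{F}_T, C_\tau = k) &= P(C_{\tau-} = j \mid \mathcal{F}_\tau, \mathcal{F}_{(\tau,T]}, C_\tau = k) \\
&\propto \left( \frac{dP}{dQ} \right)_{\mathcal{F}_{(\tau,T]} \mid \mathcal{F}_\tau, C_\tau = k, C_{\tau-} = j} P(C_{\tau-} = j \mid \mathcal{F}_\tau, C_\tau = k).
\end{aligned}$$

Since $\mathcal{F}_{(\tau,T]} \subseteq \mathcal{C}_{(\tau,T]}$ (each $Y_t = \sum_{y \in \mathbb{Z} \setminus \{0\}} y C_t^{(y)}$), the Markov property of $C_t$ implies

$$\left( \frac{dP}{dQ} \right)_{\mathcal{F}_{(\tau,T]} \mid \mathcal{F}_\tau, C_\tau = k, C_{\tau-} = j} = \left( \frac{dP}{dQ} \right)_{\mathcal{F}_{(\tau,T]} \mid \mathcal{F}_\tau, C_\tau = k},$$

because given the current information $C_\tau$ the information in the past $C_{\tau-}$ is irrelevant.
This then proves that

$$P(C_{\tau-} = j \mid \mathcal{F}_T, C_\tau = k) = P(C_{\tau-} = j \mid \mathcal{F}_\tau, C_\tau = k).$$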


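6.4 Proof of Theorem 4

Since each $C_t^{(y)}$ is independent for different $y$, the complete-data log-likelihood can
be written as

$$l_{\mathcal{C}_T}(\theta) = \sum_{y \in \mathbb{Z} \setminus \{0\}} l_{\mathcal{C}_T^{(y)} \mid C_0^{(y)}}(\theta) + \sum_{y \in \mathbb{Z} \setminus \{0\}} l_{C_0^{(y)}}(\theta),$$

where we recall that $\mathcal{C}_t^{(y)}$ is the natural filtration generated by $C_t^{(y)}$,

$$\begin{aligned}
l_{\mathcal{C}_T^{(y)} \mid C_0^{(y)}}(\theta) &= \sum_{0 < t \leq T} \left( \log(\nu(y)) 1_{\{\Delta C_t^{(y)} = 1\}} + \log(\phi C_{t-}^{(y)}) 1_{\{\Delta C_t^{(y)} = -1\}} \right) - \int_{t \in (0,T]} \left( \nu(y) + \phi C_{t-}^{(y)} \right) dt \\
&= \log(\nu(y)) N_T^{(y),A} - \nu(y) T + \log(\phi) N_T^{(y),D} - \phi \int_{t \in (0,T]} C_{t-}^{(y)}\, dt,
\end{aligned}$$

where the first equality follows directly from Theorem 1 (ignoring the constant),
and

$$l_{C_0^{(y)}}(\theta) = C_0^{(y)} \left( \log \nu(y) - \log \phi \right) - \frac{\nu(y)}{\phi}$$

because $C_0^{(y)} \sim \mathrm{Poisson}(\nu(y) / \phi)$. Thus, collecting terms will give us the required
result. The derivations of the MCLE are elementary.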
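One way to carry out these elementary steps is sketched below. The shorthand $D_t \triangleq \sum_y C_t^{(y)}$, $N_T^{A} \triangleq \sum_y N_T^{(y),A}$ and $N_T^{D} \triangleq \sum_y N_T^{(y),D}$ is assumed here for illustration, and the result may be arranged differently from the form given in (17).

```latex
% Sum the two pieces of the complete-data log-likelihood over y:
l_{\mathcal{C}_T}(\theta)
  = \sum_{y} \Big[ \log(\nu(y)) \big( N_T^{(y),A} + C_0^{(y)} \big)
      - \nu(y) \big( T + \phi^{-1} \big) \Big]
    + \log(\phi) \big( N_T^{D} - D_0 \big)
    - \phi \int_{t \in (0,T]} D_{t-} \, dt .

% First-order condition in nu(y):
\nu(y) = \frac{N_T^{(y),A} + C_0^{(y)}}{T + \phi^{-1}} .

% First-order condition in phi:
\frac{1}{\phi^{2}} \sum_{y} \nu(y) + \frac{N_T^{D} - D_0}{\phi}
  - \int_{t \in (0,T]} D_{t-} \, dt = 0 .

% Substituting sum_y nu(y) = (N_T^A + D_0) / (T + 1/phi) and clearing denominators:
\Big( T \int_{t \in (0,T]} D_{t-} \, dt \Big) \phi^{2}
  - \Big( (N_T^{D} - D_0) T - \int_{t \in (0,T]} D_{t-} \, dt \Big) \phi
  - \big( N_T^{A} + N_T^{D} \big) = 0 .
```

Under these assumptions the MCLE of $\phi$ is the positive root of the quadratic, and the MCLE of $\nu(y)$ follows by plugging that root into the first-order condition for $\nu(y)$.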

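Let

$$\|\nu\| \triangleq \int \nu(dy) = \sum_{y \in \mathbb{Z} \setminus \{0\}} \nu(y).$$

The ergodicity of $D_{t-}$ implies that as $T \to \infty$

$$\frac{1}{T} \int_{t \in (0,T]} D_{t-}\, dt \to \frac{\|\nu\|}{\phi}.$$

Since $N_T^{D} / T \to \|\nu\|$, we have

$$\frac{N_T^{D}}{T} - \frac{D_0 + T^{-1} \int_{t \in (0,T]} D_{t-}\, dt}{T} \to \|\nu\|, \text{ too.}$$

Thus, writing $B_T$ for the left-hand side of the last display,

$$\hat{\phi}_{\mathrm{MCLE}} = \frac{B_T + \sqrt{B_T^2 + 4 T^{-1} \frac{N_T^{A} + N_T^{D}}{T} T^{-1} \int_{t \in (0,T]} D_{t-}\, dt}}{2 T^{-1} \int_{t \in (0,T]} D_{t-}\, dt} \to \frac{\|\nu\| + \sqrt{\|\nu\|^2 + 0}}{2 \|\nu\| / \phi} = \phi.$$

Finally, for any $y \in \mathbb{Z} \setminus \{0\}$, $N_T^{(y),A} / T \to \nu(y)$ and $\hat{\phi}_{\mathrm{MCLE}}^{-1} \to \phi^{-1} < \infty$, so we easily have

$$\hat{\nu}_{\mathrm{MCLE}}(y) = \frac{N_T^{(y),A} + C_0^{(y)}}{T + \hat{\phi}_{\mathrm{MCLE}}^{-1}} \to \nu(y) \text{ as well.}$$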


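6.5 Proof of Proposition 1


As $C_t^{(y)} \geq 0$, (19) implies that

$$C_0^{(y)} \geq \sup_{t \in [0,T]} \left( N_t^{(-y)} - N_t^{(y)} \right) = C_{0,T}^{(y),L}, \quad y = 1, 2, \ldots,$$

where we set $N_0^{(y)} = 0$ conventionally. Now

$$C_0^{(y)} = \left\lfloor \frac{Y_0 - \sum_{y' \neq y} y' C_0^{(y')}}{y} \right\rfloor \leq \left\lfloor \frac{Y_0 - \sum_{y' \neq y} y' C_{0,T}^{(y'),L}}{y} \right\rfloor = C_{0,T}^{(y),U},$$

so we have

$$C_{0,T}^{(y),L} \leq C_0^{(y)} \leq C_{0,T}^{(y),U}.$$

Let $N_t^{(-y),*}$ be the counting process of $-y$ jumps resulting from the departures
of those initial events of size $y$ that constitute $C_0^{(y)}$. Let $\tau$ be the time when $N_t^{(-y),*}$
achieves $C_0^{(y)}$. Then we have

$$\begin{aligned}
C_{0,T}^{(y),L} &= C_{0,\tau}^{(y),L} \vee \sup_{t \in (\tau,T]} \left( N_t^{(-y),*} - \left( N_t^{(y)} - \left( N_t^{(-y)} - N_t^{(-y),*} \right) \right) \right) \\
&= C_{0,\tau}^{(y),L} \vee \left( C_0^{(y)} - \inf_{t \in (\tau,T]} \left( N_t^{(y)} - \left( N_t^{(-y)} - N_t^{(-y),*} \right) \right) \right).
\end{aligned}$$

Observe that $N_t^{(y)} - ( N_t^{(-y)} - N_t^{(-y),*} )$ is an M/G/$\infty$ queue initiated at state 0, so by
ergodicity we must have with probability 1

$$\lim_{T \to \infty} \inf_{t \in (\tau,T]} \left( N_t^{(y)} - \left( N_t^{(-y)} - N_t^{(-y),*} \right) \right) = 0.$$

This then shows that actually

$$\lim_{T \to \infty} C_{0,T}^{(y),L} = C_{0,\tau}^{(y),L} \vee C_0^{(y)} = C_0^{(y)},$$

where the last equality follows because $C_{0,\tau}^{(y),L} \leq C_0^{(y)}$. Correspondingly,

$$\lim_{T \to \infty} C_{0,T}^{(y),U} = \left\lfloor \frac{Y_0 - \sum_{y' \neq y} y' C_0^{(y')}}{y} \right\rfloor = C_0^{(y)}.$$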